A Corpus for Modeling Morpho-Syntactic Agreement in Arabic: Gender, Number and Rationality
Abstract
We present an enriched version of the Penn Arabic Treebank (Maamouri et al., 2004), in which latent features necessary for modeling morpho-syntactic agreement in Arabic are manually annotated. We describe our process for efficient annotation and present the first quantitative analysis of Arabic morpho-syntactic phenomena.
Related resources
A Class-Based Agreement Model for Generating Accurately Inflected Translations
When automatically translating from a weakly inflected source language like English to a target language with richer grammatical features such as gender and dual number, the output commonly contains morpho-syntactic agreement errors. To address this issue, we present a target-side, class-based agreement model. Agreement is promoted by scoring a sequence of fine-grained morpho-syntactic classes ...
Identifying Broken Plurals, Irregular Gender, and Rationality in Arabic Text
Arabic morphology is complex, partly because of its richness, and partly because of common irregular word forms, such as broken plurals (which resemble singular nouns), and nouns with irregular gender (feminine nouns that look masculine and vice versa). In addition, Arabic morphosyntactic agreement interacts with the lexical semantic feature of rationality, which has no morphological realizatio...
Morpho-syntactic processing of Arabic plurals after aphasia: dissecting lexical meaning from morpho-syntax within word boundaries.
Within the domain of inflectional morpho-syntax, differential processing of regular and irregular forms has been found in healthy speakers and in aphasia. One view assumes that irregular forms are retrieved as full entities, while regular forms are compiled on-line. An alternative view holds that a single mechanism oversees regular and irregular forms. Arabic offers an opportunity to study this...
Portable Language Technology: a Resource-light Approach to Morpho-syntactic Tagging
Morpho-syntactic tagging is the process of assigning part of speech (POS), case, number, gender, and other morphological information to each word in a corpus. Morpho-syntactic tagging is an important step in natural language processing. Corpora that have been morphologically tagged are very useful both for linguistic research, e.g. finding instances or frequencies of particular constructions in...
A Study on Morpho-Syntactic Patterns: A Cohesive Device in Some Persian Live Sport Radio and TV Talks
Morpho-syntactic patterns are a subcategory of cohesive devices that help hearers form an adequate mental representation for understanding speech. This article investigates the morpho-syntactic patterns employed in some Persian live sport radio and TV programs, adapting Dooley and Levinsohn’s theoretical and analytical framework. The research data includes around 30,000 ...